32,435 research outputs found

    Facilitating translation using source language paraphrase lattices

    Get PDF
    For resource-limited language pairs, coverage of the test set by the parallel corpus is an important factor that affects translation quality in two respects: 1) out of vocabulary words; 2) the same information in an input sentence can be expressed in different ways, while current phrase-based SMT systems cannot automatically select an alternative way to transfer the same information. Therefore, given limited data, in order to facilitate translation from the input side, this paper proposes a novel method to reduce the translation difficulty using source-side lattice-based paraphrases. We utilise the original phrases from the input sentence and the corresponding paraphrases to build a lattice with estimated weights for each edge to improve translation quality. Compared to the baseline system, our method achieves relative improvements of 7.07%, 6.78% and 3.63% in terms of BLEU score on small, medium and largescale English-to-Chinese translation tasks respectively. The results show that the proposed method is effective not only for resourcelimited language pairs, but also for resource sufficient pairs to some extent

    Improved phrase-based SMT with syntactic reordering patterns learned from lattice scoring

    Get PDF
    In this paper, we present a novel approach to incorporate source-side syntactic reordering patterns into phrase-based SMT. The main contribution of this work is to use the lattice scoring approach to exploit and utilize reordering information that is favoured by the baseline PBSMT system. By referring to the parse trees of the training corpus, we represent the observed reorderings with source-side syntactic patterns. The extracted patterns are then used to convert the parsed inputs into word lattices, which contain both the original source sentences and their potential reorderings. Weights of the word lattices are estimated from the observations of the syntactic reordering patterns in the training corpus. Finally, the PBSMT system is tuned and tested on the generated word lattices to show the benefits of adding potential sourceside reorderings in the inputs. We confirmed the effectiveness of our proposed method on a medium-sized corpus for Chinese-English machine translation task. Our method outperformed the baseline system by 1.67% relative on a randomly selected testset and 8.56% relative on the NIST 2008 testset in terms of BLEU score

    Incorporating source-language paraphrases into phrase-based SMT with confusion networks

    Get PDF
    To increase the model coverage, sourcelanguage paraphrases have been utilized to boost SMT system performance. Previous work showed that word lattices constructed from paraphrases are able to reduce out-ofvocabulary words and to express inputs in different ways for better translation quality. However, such a word-lattice-based method suffers from two problems: 1) path duplications in word lattices decrease the capacities for potential paraphrases; 2) lattice decoding in SMT dramatically increases the search space and results in poor time efficiency. Therefore, in this paper, we adopt word confusion networks as the input structure to carry source-language paraphrase information. Similar to previous work, we use word lattices to build word confusion networks for merging of duplicated paths and faster decoding. Experiments are carried out on small-, medium- and large-scale English– Chinese translation tasks, and we show that compared with the word-lattice-based method, the decoding time on three tasks is reduced significantly (up to 79%) while comparable translation quality is obtained on the largescale task

    Source-side syntactic reordering patterns with functional words for improved phrase-based SMT

    Get PDF
    Inspired by previous source-side syntactic reordering methods for SMT, this paper focuses on using automatically learned syntactic reordering patterns with functional words which indicate structural reorderings between the source and target language. This approach takes advantage of phrase alignments and source-side parse trees for pattern extraction, and then filters out those patterns without functional words. Word lattices transformed by the generated patterns are fed into PBSMT systems to incorporate potential reorderings from the inputs. Experiments are carried out on a medium-sized corpus for a Chinese–English SMT task. The proposed method outperforms the baseline system by 1.38% relative on a randomly selected testset and 10.45% relative on the NIST 2008 testset in terms of BLEU score. Furthermore, a system with just 61.88% of the patterns filtered by functional words obtains a comparable performance with the unfiltered one on the randomly selected testset, and achieves 1.74% relative improvements on the NIST 2008 testset

    Design and analysis of a novel long-distance double tendon-sheath transmission device for breast intervention robots under MRI field

    Get PDF
    Cancer represents a major threat to human health. Magnetic resonance imaging (MRI) provides superior performance to other imaging-based examination methods in the detection of tumors and offers distinct advantages in biopsy and seed implantation. However, because of the MRI environment, the material requirements for actuating devices for the medical robots used in MRI are incredibly demanding. This paper describes a novel double tendon-sheath transmission device for use in MRI applications. LeBus grooves are used in the original transmission wheels, thus enabling the system to realize long-distance and large-stroke transmission with improved accuracy. The friction model of the transmission system and the transmission characteristics model of the novel tendon-sheath structure are then established. To address the problem that tension sensors cannot be installed in large-stroke transmission systems, a three-point force measurement method is used to measure and set an appropriate preload in the novel tendon-sheath transmission system. Additionally, experiments are conducted to verify the accuracy of the theoretical model and multiple groups of tests are performed to explore the transmission characteristics. Finally, the novel tendon-sheath transmission system is compensated to improve its accuracy and the experimental results acquired after compensation show that the system satisfies the design requirements
    corecore